[Paper Review] Boosting Standard Classification Architectures Through a Ranking Regularizer

Summary

The paper proposes a network architecture that uses triplet loss as a regularizer alongside softmax loss.
- Two heads follow the conv feature maps:
  - (1st head) : GAP -> fc (n_classes) -> softmax
  - (2nd head) : flatten -> fc (n_embed) -> triplet
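
To make the two heads concrete, here is a minimal pure-Python sketch of the forward pass for a single feature map. The sizes (`C`, `H`, `W`, `n_classes`, `n_embed`), the random initialization, and every function name are assumptions chosen for illustration; the paper attaches these heads to a real CNN backbone, which is not reproduced here.

```python
import random

def global_average_pool(fmap):
    """fmap is C x H x W (nested lists); returns one mean per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]

def flatten(fmap):
    """Flatten a C x H x W feature map into a single vector."""
    return [v for ch in fmap for row in ch for v in row]

def linear(x, weight, bias):
    """Fully connected layer; weight has shape out_dim x in_dim."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]

def init_linear(in_dim, out_dim, rng):
    """Hypothetical initializer: small uniform weights, zero bias."""
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    return weight, [0.0] * out_dim

def two_head_forward(fmap, cls_params, emb_params):
    # 1st head: GAP -> fc (n_classes); the logits feed the softmax loss
    logits = linear(global_average_pool(fmap), *cls_params)
    # 2nd head: flatten -> fc (n_embed); the embedding feeds the triplet loss
    embedding = linear(flatten(fmap), *emb_params)
    return logits, embedding

rng = random.Random(0)
C, H, W = 8, 4, 4            # assumed feature-map size
n_classes, n_embed = 5, 16   # assumed head sizes
fmap = [[[rng.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
cls_params = init_linear(C, n_classes, rng)
emb_params = init_linear(C * H * W, n_embed, rng)
logits, embedding = two_head_forward(fmap, cls_params, emb_params)
```

Both heads read the same feature map, so the gradient of the triplet term reaches the shared backbone even though the embedding itself is not used by the classifier.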
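Triplet loss
- In the embedding space of the batch samples:
  - keep the distance between the anchor sample and the positive sample small
  - keep the distance between the anchor sample and the negative sample large
- batch sampling strategy
  - Semi-hard negative sampling gave the best performance.
Classification performance
- On FGVR (fine-grained visual recognition) it is close to vanilla softmax (a 1~2% difference)
- On imbalanced datasets the improvement was substantial (6% or more)
Feature embedding performance
- There was no comparison against the features of vanilla softmax.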

How should the normalized mutual information (NMI) metric used to evaluate retrieval performance be implemented?
- https://scikit-learn.org/stable/modules/generated/sklearn.metrics.normalized_mutual_info_score.html
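
As a rough answer to the question above, NMI can be computed directly from two label lists. The sketch below assumes the embeddings have already been clustered (for example with K-means, which is an assumption here and not something these notes confirm) and compares the cluster ids with the ground-truth classes. It normalizes by the arithmetic mean of the two entropies, which, as far as I know, is also the default of the scikit-learn function linked above.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """Mutual information between two label assignments of equal length."""
    n = len(a)
    count_a, count_b, joint = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (nij / n) * math.log(nij * n / (count_a[i] * count_b[j]))
        for (i, j), nij in joint.items()
    )

def nmi(true_labels, cluster_ids):
    """NMI normalized by the arithmetic mean of the two entropies."""
    h_true, h_pred = entropy(true_labels), entropy(cluster_ids)
    if h_true == 0.0 and h_pred == 0.0:
        return 1.0  # both assignments are a single group
    return mutual_information(true_labels, cluster_ids) / ((h_true + h_pred) / 2)
```

NMI is invariant to how the clusters are numbered, so `nmi([0, 0, 1, 1], [1, 1, 0, 0])` scores a perfect match even though the ids are swapped.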
I do not understand the batch construction procedure.
- K=4? : does it mean that only 4 classes are sampled within a batch of size 32?
- Let's look at how "In Defense of the Triplet Loss for Person Re-Identification" constructs its batches.
  1. Randomly select P classes
  2. Select K images for each class
  3. For each of the P * K anchor samples, form a triplet from its hardest positive / hardest negative
    - hardest positive : the sample farthest from the anchor among those of the same class.
    - hardest negative : the sample closest to the anchor among those of a different class.
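
The three steps above can be sketched in pure Python as follows. The function names, the use of Euclidean distance, and the toy data are assumptions for illustration. Under this P x K reading, a batch of 32 with K=4 would correspond to P=8 classes, but these notes do not confirm that the reviewed paper means exactly that.

```python
import math
import random
from collections import defaultdict

def sample_pk_batch(labels, P, K, rng):
    """Steps 1-2: pick P classes at random, then K sample indices per class."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    # Only classes with at least K samples can contribute K distinct images
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= K]
    batch = []
    for c in rng.sample(eligible, P):
        batch.extend(rng.sample(by_class[c], K))
    return batch

def batch_hard_triplets(embeddings, labels):
    """Step 3: for every anchor, pair the hardest positive with the hardest negative."""
    triplets = []
    for a, (ea, ya) in enumerate(zip(embeddings, labels)):
        pos = [i for i, y in enumerate(labels) if y == ya and i != a]
        neg = [i for i, y in enumerate(labels) if y != ya]
        if not pos or not neg:
            continue  # anchor has no valid partner in this batch
        p = max(pos, key=lambda i: math.dist(ea, embeddings[i]))  # farthest same-class
        n = min(neg, key=lambda i: math.dist(ea, embeddings[i]))  # closest other-class
        triplets.append((a, p, n))
    return triplets

# Toy dataset (assumed): 8 classes x 10 samples with random 2-D embeddings
rng = random.Random(0)
all_labels = [i // 10 for i in range(80)]
all_embeddings = [(rng.random(), rng.random()) for _ in all_labels]

batch_idx = sample_pk_batch(all_labels, P=8, K=4, rng=rng)
batch_labels = [all_labels[i] for i in batch_idx]
batch_embeddings = [all_embeddings[i] for i in batch_idx]
triplets = batch_hard_triplets(batch_embeddings, batch_labels)
```

Because every class in the batch has K samples, each anchor is guaranteed K-1 positives, which is the point of sampling by class rather than uniformly over images.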

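Typical image classification tasks use softmax loss.
Limitation of softmax loss : a weak embedding
- It only learns to separate samples in the embedding space.
- It does not consider margins in the embedding space.
What makes a good embedding?
- Samples of the same class should be close together.
- Samples of different classes should be far apart.
Contributions of the paper
1. Proposes a 2-head architecture
  - based on softmax loss, with triplet loss used as a regularization loss
  - improves classifier performance by learning a better feature embedding
2. Evaluates batch size for triplet loss
  - achieves good performance even with a small batch size.
3. Achieves better retrieval performance than standard classification.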

A softmax-based classifier does not consider intra-class compactness or maximizing the inter-class margin.
Embedding regularization is a way to address this limitation.

Uses of triplet loss in prior work
- feature embedding tool
- measuring similarity between objects
- clustering metric
Triplet loss in the paper
- used together with softmax as a regularization loss
- loss_triplet = max(0, D(a, p) - D(a, n) + margin)
  - a : anchor sample
  - p : a sample of the same class as a
  - n : a sample of a different class from a
  - margin = 2
- L = L_soft + lambda*L_tri
  - lambda = 1
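
The two formulas above can be written out as a small pure-Python sketch for a single sample. It uses the usual hinged form of the triplet loss (clamped at zero) and assumes `D` is Euclidean distance, which these notes do not specify. The function names are illustrative; `margin=2.0` and `lam=1.0` follow the values listed above.

```python
import math

def softmax_cross_entropy(logits, target):
    """L_soft for one sample, using the log-sum-exp trick for stability."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def triplet_loss(anchor, positive, negative, margin=2.0):
    """L_tri for one triplet: max(0, D(a, p) - D(a, n) + margin)."""
    d_ap = math.dist(anchor, positive)  # assumed: Euclidean distance
    d_an = math.dist(anchor, negative)
    return max(0.0, d_ap - d_an + margin)

def total_loss(logits, target, anchor, positive, negative, lam=1.0, margin=2.0):
    """L = L_soft + lambda * L_tri."""
    return softmax_cross_entropy(logits, target) + lam * triplet_loss(
        anchor, positive, negative, margin
    )
```

A triplet whose negative is already farther than the positive by at least the margin contributes zero, so only the softmax term drives the update for that sample.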
batch sampling strategy
- categories of negative samples
  - anchor -- hard negative -- positive -- semi-hard negative -- margin -- easy negative
  - anchor : the reference point
  - negative
    1. hard : closer to the anchor than the positive.
    2. semi-hard : farther than the positive but within the margin.
    3. easy : beyond the margin; easy to separate.
- The paper samples negatives in the priority order semi-hard -> easy -> hard
  - Because the batch size is small (b=32), it does not restrict itself to semi-hard or hard negatives only.
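
A minimal sketch of the categorization and the priority order described above, for one anchor/positive pair. Euclidean distance, the random pick within a category, and all names are assumptions; the notes only give the three categories and the order semi-hard -> easy -> hard.

```python
import math
import random

def negative_kind(d_ap, d_an, margin):
    """Classify a negative by its distance to the anchor relative to the positive."""
    if d_an < d_ap:
        return "hard"        # closer than the positive
    if d_an < d_ap + margin:
        return "semi-hard"   # farther than the positive, still inside the margin
    return "easy"            # beyond the margin

def pick_negative(anchor, positive, negatives, margin=2.0, rng=random):
    """Return (index, kind) of a negative chosen by priority semi-hard -> easy -> hard."""
    d_ap = math.dist(anchor, positive)
    buckets = {"hard": [], "semi-hard": [], "easy": []}
    for i, neg in enumerate(negatives):
        buckets[negative_kind(d_ap, math.dist(anchor, neg), margin)].append(i)
    for kind in ("semi-hard", "easy", "hard"):
        if buckets[kind]:
            # Assumed tie-break: uniform random choice within the category
            return rng.choice(buckets[kind]), kind
    return None, None  # no negatives were supplied
```

Falling back to the other categories keeps every anchor usable: with only 32 samples per batch, a given anchor may well have no semi-hard negative at all.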

Datasets
- Aircraft
- NABirds
- Flower-102
- Stanford Cars
- Stanford Dogs
Baselines
1. softmax
2. 2-head leveraging center loss
  - the 2-head architecture proposed in the paper
  - uses center loss as the regularization loss
Proposed Method : 2-head + Triplet loss
- batch size : 32
- embedding dimension : 256
- batch sampling
  - hard sampling
  - semi-hard sampling
    - margin = 2
- batch construction procedure
  - ?

Evaluation metrics
1. recall
2. normalized mutual information
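
The notes only say "recall"; assuming this means Recall@K as commonly used for retrieval, a minimal sketch is shown below. Each sample is used as a query against all other samples, and the query counts as a hit if any of its K nearest neighbours shares its class. Euclidean distance and the function name are assumptions.

```python
import math

def recall_at_k(embeddings, labels, k=1):
    """Fraction of queries with at least one same-class item among their k nearest neighbours."""
    hits = 0
    for q, (eq, yq) in enumerate(zip(embeddings, labels)):
        # Rank every other sample by distance to the query
        others = [i for i in range(len(embeddings)) if i != q]
        nearest = sorted(others, key=lambda i: math.dist(eq, embeddings[i]))[:k]
        hits += any(labels[i] == yq for i in nearest)
    return hits / len(embeddings)
```

This brute-force version is quadratic in the number of samples, which is fine for a sanity check but would need a nearest-neighbour index for a full test set.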
Feature embedding performance comparison (Table 5)
- batch sampling strategy
  - Triplet loss + semi-hard sampling
    - the most stable
  - Triplet loss + hard sampling
    - strong on ResNet but very unstable on Inception
  - center loss
    - the most unstable
- network architecture
  - Best to worst: DenseNet -> ResNet -> Inception
With triplet loss + semi-hard negative sampling, better classification performance went along with better retrieval performance. (Table 6)